Abstract

In recent times, a considerable amount of work has been devoted to the development and
analysis of gossip algorithms in Geometric Random Graphs. In a recently introduced model
termed "Geographic Gossip," each node is aware of its position but possesses no further
information. Traditionally, gossip protocols have always used convex linear combinations to
achieve averaging. We develop a new protocol for Geographic Gossip in which,
counter-intuitively, we use non-convex affine combinations as updates in addition to convex
combinations to accelerate the averaging process. The dependence of the number of
transmissions used by our algorithm on the number of sensors $n$ is
$n \exp(O(\log\log n)^2) = n^{1+o(1)}$. For the previous algorithm, this dependence was
$\tilde{O}(n^{1.5})$. The exponent $1 + o(1)$ of our algorithm is asymptotically optimal.
Our algorithm involves a hierarchical structure of $\log\log n$ depth and is not completely
decentralized. However, the extent of control exercised by a sensor on another is
restricted to switching the other on or off.



1 Introduction 



Geometric Random Graphs have become an accepted model for wireless ad hoc and sensor 
networks. Due to applications in distributed sensing, a significant amount of effort has 
been directed towards developing energy efficient algorithms for information exchange on 
these graphs. The problem of distributed averaging has been studied intensively because 
it appears in several applications such as estimation on ad hoc networks, and encapsulates 
many of the difficulties faced in asynchronous distributed computation. Let $v_1, \ldots, v_n$
be $n$ points independently chosen uniformly at random from a unit square in $\mathbb{R}^2$.
A Geometric Random Graph $G(n, r)$ is obtained from these points by connecting any two points
within Euclidean distance $r$. A Gossip Algorithm is an averaging algorithm that, after a
certain number of information exchanges and updates, leaves each node with a value close to
the average of all the originally held values.
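
The definition of $G(n, r)$ above can be made concrete with a short sketch. The function names `geometric_random_graph` and `is_connected`, and the `seed` parameter, are illustrative choices and not part of the paper; the quadratic pairwise distance check is used only for clarity.

```python
import math
import random


def geometric_random_graph(n, r, seed=None):
    """Sample n uniform points in the unit square and connect pairs within distance r."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            # Two points are adjacent when their Euclidean distance is at most r.
            if math.dist(points[i], points[j]) <= r:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return points, neighbors


def is_connected(neighbors):
    """Graph search from an arbitrary node; True when every node is reached."""
    if not neighbors:
        return True
    start = next(iter(neighbors))
    seen = {start}
    frontier = [start]
    while frontier:
        u = frontier.pop()
        for w in neighbors[u]:
            if w not in seen:
                seen.add(w)
                frontier.append(w)
    return len(seen) == len(neighbors)
```

With $r$ larger than $\sqrt{2}$ (the diagonal of the unit square) every pair is adjacent, so the graph is complete; the regime of interest in this paper is much smaller radii.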

1.1 Related Work 

There is an extensive body of work surrounding the subject of gossip algorithms in various 
contexts. Here, we only survey the results relevant in a narrow sense to the question under 
consideration. 

Gupta and Kumar [I] gave conditions under which $G(n, r)$ is connected with high probability
(w.h.p.). It is sufficient that $r$ scales as $\Omega(\sqrt{\log n / n})$ in order that
$G(n, r)$ be connected with probability greater than $1 - n^{-\Theta(1)}$.

A distributed Gossip Algorithm for arbitrary graphs was presented by Boyd et al. [1]. In
this algorithm, when the clock of a sensor $s$ ticks, $s$ sends its value $x_s$ to a sensor
$v$ chosen uniformly at random from its neighbors, and receives the value $x_v$ of $v$.
Thereafter $s$ and $v$ set their values to $\frac{x_s + x_v}{2}$. The dependence of the
number of transmissions required by this algorithm on $n$ is $\tilde{O}(n^2)$. The
performance was related to the mixing time of the natural random walk on that graph. In fact
they showed that if the connectivity graph is $G$, the number of transmissions made in the
course of the algorithm is $\Theta(n T_{mix}(G))$, where $T_{mix}(G)$ is the mixing time of $G$.
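
The pairwise averaging step just described can be sketched as follows. This is a minimal illustration, not the implementation analyzed in [1]: the function name `pairwise_gossip`, the adjacency-dictionary input and the discrete tick loop (one uniformly chosen node per tick) are assumptions made for the example.

```python
import random


def pairwise_gossip(values, neighbors, ticks, seed=None):
    """Run `ticks` rounds of neighbor averaging and return the final values.

    values    -- list of initial node values
    neighbors -- dict mapping node index to an iterable of adjacent node indices
    """
    rng = random.Random(seed)
    x = list(values)
    for _ in range(ticks):
        s = rng.randrange(len(x))          # node whose clock ticks
        adjacent = sorted(neighbors[s])
        if not adjacent:
            continue                       # isolated node: nothing to exchange
        v = rng.choice(adjacent)           # uniformly random neighbor
        # Both endpoints adopt the average, a convex combination with weight 1/2.
        x[s] = x[v] = (x[s] + x[v]) / 2
    return x
```

Each exchange leaves the sum of all values unchanged and never increases the spread between the largest and smallest value, which is why the values drift toward the global average.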

In the standard framework for modeling sensor networks, $n$ sensors are placed at random
on a unit square □ and have a radius of connectivity $r = \Theta(\sqrt{\log n / n})$. One
does not assume that a sensor possesses any information about its own location. In this
model, the number of transmissions that the best known algorithm uses is $\tilde{O}(n^2)$,
as described above.¹

A more powerful model was proposed by Dimakis et al. [5], wherein each sensor is aware of
its own location with reference to □, but possesses no further information. It is mentioned
in [5] that this is reasonable in typical scenarios. With this model, by exploiting geographic
information, they were able to provide an algorithm that requires $\tilde{O}(n^{1.5})$
transmissions. In their algorithm, each node exchanges its value with the node nearest to a
position chosen randomly on □, and both nodes replace their values by the average as in the
algorithm of Boyd et al. [1]. Rejection sampling is used to make the distribution roughly
uniform on nodes. The routing takes $O(\sqrt{n})$ hops w.h.p., but since the mixing time on
the complete graph is $O(1)$, one obtains an algorithm using $\tilde{O}(n^{1.5})$
transmissions, which is an improvement over [1] by a factor of $\tilde{O}(\sqrt{n})$.

A natural approach to obtaining more efficient algorithms would be to engage in long-range
information exchanges less frequently than short-range ones. However, it appears that the
benefit derived from an improved mixing time with long-range transmissions more than
compensates for the additional cost in terms of hops for a long-range routing. Due to this
fact, simply altering the probability distribution with which a node picks targets seems to
be counterproductive.

¹ In using $\tilde{O}$, we ignore polylogarithmic factors and, depending on context, the
dependence on parameters other than $n$.

1.2 Our Contribution 

An affine combination of two vectors $a$ and $b$ has the form $\alpha a + (1 - \alpha) b$.
Unlike the case of convex combinations, $\alpha$ need not belong to $[0, 1]$. We introduce
counter-intuitive update rules which are affine combinations rather than convex combinations
(with coefficients possibly as large as $\Omega(\sqrt{n})$) to achieve faster averaging. The
total number of transmissions used by the proposed algorithm in order that the
$\ell_2$-distance of the output from the average diminish by a multiplicative factor of
$\epsilon$ w.h.p. is $n \exp(O((\log\log n) \log\log \frac{n}{\epsilon}))$. When
$\epsilon = \exp(-n^{o(1)/\log\log n})$, the number of transmissions is $n^{1+o(1)}$. The
exponent $1 + o(1)$ is asymptotically optimal, since every node must make at least one
transmission for an averaging algorithm to work. Like previous algorithms, ours makes packet
exchanges with random nodes. Due to the instability introduced into the system by the use of
non-convex combinations, for the present analysis to hold, a certain amount of control needs
to be exercised and our algorithm is not truly decentralized. However, the extent of control
exerted by any sensor on another is restricted to switching the other on or off.

2 Preliminaries 

The standard model for a sensor network is as follows. We assume that each node or sensor
has a clock that is a Poisson process with rate 1, and that these processes are independent.
This model is equivalent to having a single clock that is Poisson of rate $n$, and assigning
clock ticks to nodes uniformly at random. We assume that the time units are adjusted so that
communication time between any two adjacent nodes is insignificant in comparison with the
length of an average time slot, $n^{-1}$. Our algorithm involves packet forwarding when two
non-adjacent nodes communicate. We shall assume that the time taken to forward a packet
is also insignificant in comparison with $n^{-1}$ and that a single packet exists in the
network in each time slot w.h.p. We assume some limited computational power, which amounts to
memory of logarithmic size, and the ability to do floating point computations.
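
The equivalent single-clock view can be simulated directly. The sketch below assumes the rate-$n$ formulation stated above; the function name `global_clock_ticks` and its parameters are illustrative only.

```python
import random


def global_clock_ticks(n, num_ticks, seed=None):
    """Return a list of (time, node) pairs for a rate-n Poisson clock.

    Inter-arrival times are exponential with mean 1/n, and each tick is
    assigned to one of the n nodes uniformly at random.
    """
    rng = random.Random(seed)
    t = 0.0
    ticks = []
    for _ in range(num_ticks):
        t += rng.expovariate(n)            # exponential gap with rate n
        ticks.append((t, rng.randrange(n)))
    return ticks
```

The average gap between consecutive ticks is $n^{-1}$, which is the time slot length referred to in the assumptions above.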

For our purposes, a Geometric Random Graph is defined in the following way. Let
$v_1, \ldots, v_n$ be $n$ points independently chosen uniformly at random from a unit square
in $\mathbb{R}^2$. A Geometric Random Graph $G(n, r)$ is obtained from these points by
connecting any two points within Euclidean distance $r$.
2.1 Problem Statement

Let node $v_i$, for $i = 1, \ldots, n$, hold a value $x_i(t)$ at the $t$-th global clock
tick, the initial values being $x_i(0)$. Without loss of generality, we assume
$\sum_i x_i(0) = 0$. Given $\epsilon, \delta > 0$, the task is to design an algorithm such
that $\|x(t)\| < \epsilon \|x(0)\|$ for all possible choices of $x(0)$ with probability
$> 1 - \delta$. The cost of the algorithm is the expected number of transmissions made
until $t$.

In the rest of the paper, we shall make the standard assumption that the radius of
connectivity $r(n) = \Theta(\sqrt{\log n / n})$ (e.g. [5]). Under this assumption, the
probability of the graph $G(n, r)$ being disconnected is $\Omega(n^{-O(1)})$, for an
appropriate constant in the exponent. As a consequence, it is not possible to drive $\delta$
below $n^{-O(1)}$. For this reason, in the analysis, we shall assume that
$\delta = n^{-O(1)}$. On the other hand, $\epsilon$ can be made arbitrarily small by running
the averaging algorithm for a sufficiently long interval of time. In this paper, we shall
assume that $\log \frac{1}{\epsilon} = n^{o(1)/\log\log n}$. This does not allow $\epsilon$
to be exponentially small but permits it to be the reciprocal of a quasipolynomial. A
sufficiently large constant $\alpha$ will appear in the parameters of our algorithm described
later. When we use the term high probability, we shall mean with probability
$1 - n^{-\Theta(1)}$.



3 Overview of Algorithm 

Let □ be the unit square in which the $n$ sensors are randomly placed. Let the initial values
carried by sensors be $x_i(0)$, for $i = 1$ to $n$. We consider a partition of □ into
$\sim n^{1/2}$ smaller squares $\square_i$. Let $\square_i$ contain $\#(\square_i)$ sensors.
Let $time(n)$ represent the expected number of transmissions until
$\|x(t)\| < \epsilon\|x(0)\|$ w.h.p., where $\epsilon$ is some function of $n$ that we shall
not investigate at the moment. Suppose that we had a "nearly perfect" averaging protocol $A$
on the smaller squares, i.e. when $A$ is run on each square, after $t = time_A(\sqrt{n})$
transmissions, within $\square_i$ the values are for practical purposes equal to the average
of the original values. That is,

$$(\forall i)(\forall a \in \square_i) \quad x_a(t) \approx \frac{\sum_{b \in \square_i} x_b(0)}{\#(\square_i)}.$$

Definition 1 For each square $\square_i$, let $s(\square_i)$ be the sensor closest to the
center of $\square_i$.

This can be determined by each square, using a constant number of transmissions w.h.p.

The $s(\square_i)$ exchange values among themselves by Greedy Geographic Routing (see [5]).

Consider the following protocol. Suppose that $A$ has been run on each subsquare of the form
$\square_i$ independently, and the values carried by the nodes within $\square_i$ are all
equal. When $s(\square_i)$ becomes active, the following round takes place.



1. $s_i := s(\square_i)$ picks a square $\square_j$ uniformly at random. $s_i$ geographically
routes a packet with its value to $s_j := s(\square_j)$.

2. $s_j$ routes its own value to $s_i$ by greedy geographic routing.

3. $x_{s_i} \leftarrow x_{s_i} + \frac{2\sqrt{n}}{5}(x_{s_j} - x_{s_i})$.

4. $x_{s_j} \leftarrow x_{s_j} + \frac{2\sqrt{n}}{5}(x_{s_i} - x_{s_j})$.

5. $A$ is independently run on $\square_i$ (the process being activated by $s_i$ by switching
certain nodes on) and on $\square_j$ (initiated by $s_j$ similarly).

6. $A$ is ended on square $\square_i$ by $s_i$ (by turning certain nodes off), and $A$ is
ended on $\square_j$ by $s_j$ (by switching certain nodes off).
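
The effect of one such round can be illustrated with a toy computation. The sketch below is not the paper's protocol: it assumes two squares whose nodes already hold equal values, replaces the averaging protocol $A$ by exact averaging, and uses a hypothetical parameter `beta` for the affine weight applied at the two center sensors in steps 3 and 4.

```python
def affine_exchange_then_average(square_i, square_j, beta):
    """One idealized round between two squares.

    square_i, square_j -- lists of node values; index 0 is the center sensor
    beta               -- weight of the affine update; beta > 1 is non-convex
    Returns the two squares after exact averaging inside each square.
    """
    xi, xj = square_i[0], square_j[0]
    new_i, new_j = list(square_i), list(square_j)
    # Steps 3 and 4: the centers update using each other's old value.
    new_i[0] = xi + beta * (xj - xi)
    new_j[0] = xj + beta * (xi - xj)
    # Steps 5 and 6, idealized: perfect averaging inside each square.
    avg_i = sum(new_i) / len(new_i)
    avg_j = sum(new_j) / len(new_j)
    return [avg_i] * len(new_i), [avg_j] * len(new_j)
```

With four nodes per square, values 0 and 8, and `beta = 2.0`, the centers briefly hold 16 and -8, yet after local averaging both squares sit at the common mean 4. A convex weight such as `beta = 0.5` moves the square averages only to 1 and 7, which is why a weight proportional to the square size speeds up mixing between squares.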

Now, let $z_i(t) := x_{s_i}(t)$. Without loss of generality, we assume that
$\sum_i x_i = 0$, since this only adds a constant offset and does not affect the rate of
convergence. An application of the Chernoff Bound tells us that

$$(\forall i) \quad \left| \frac{\#(\square_i)}{\sqrt{n}} - 1 \right| < \frac{1}{5} \quad \text{w.h.p.}$$

If we examine the evolution of $z$, we see that after a round of the kind described above

• $z_i(t) = (1 - \alpha_i) z_i(t-1) + \alpha_j z_j(t-1)$

• $z_j(t) = (1 - \alpha_j) z_j(t-1) + \alpha_i z_i(t-1)$

where $\forall i$, $\alpha_i \in (\frac{1}{3}, \frac{1}{2})$. From Lemma 1 it follows that
$E[\|z(t)\|^2] \le (1 - \frac{1}{2\sqrt{n}})^t \|z(0)\|^2$. Roughly speaking, after
$O(\sqrt{n} \log\frac{1}{\epsilon})$ of these steps, we have a distribution $x(t')$ such that
$\|x(t')\| < \epsilon\|x(0)\|$.

Each geographical routing mentioned above takes $O(\sqrt{n})$ transmissions w.h.p. (see [5]).
Also, each process of initiating or ending $A$ on a square takes $O(\sqrt{n})$ transmissions.

So, the total number of transmissions with $n$ nodes, $time(n)$, satisfies a recurrence of
the form:

$$time(n) \le O\left( \sqrt{n} \log\frac{1}{\epsilon} \left( time_A(\sqrt{n}) + O(\sqrt{n}) \right) \right).$$

Ignoring the dependence on $\epsilon$, this would allow us to recursively define the
algorithm $A$ on □, for which $time_A(n) = n \exp(O(\log\log n)^2)$.
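
To see how a bound of this shape can arise, the recurrence can be unrolled as in the following sketch. It rests on simplifying assumptions that are not made in the text: $time_A$ is identified with $time$ (written $T$), and a single factor $L \ge 1$, standing for the hidden constant times $\log\frac{1}{\epsilon}$, is used at every level of the recursion.

```latex
\begin{align*}
T(n) &\le L\sqrt{n}\,\bigl(T(\sqrt{n}) + \sqrt{n}\bigr),
  \qquad U(n) := T(n)/n, \\
U(n) &\le L\,\bigl(U(\sqrt{n}) + 1\bigr) \\
     &\le L^{k}\,U\bigl(n^{1/2^{k}}\bigr) + \sum_{j=1}^{k} L^{j}
      \;\le\; L^{k}\Bigl(U\bigl(n^{1/2^{k}}\bigr) + k\Bigr).
\end{align*}
```

Taking $k = \log_2\log_2 n$ levels brings the subproblem size down to a constant, so $U(n) \le L^{O(\log\log n)}$. If $L$ is polylogarithmic in $n$, then $L^{O(\log\log n)} = \exp(O(\log\log n)^2)$ and hence $T(n) = n \exp(O(\log\log n)^2)$.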
4 Description of the Algorithm

4.1 Notation

The square □ is partitioned into $n_1$ subsquares $\square_i$, where $n_1$ is the nearest
integer to $\sqrt{n}$ that is the square of an even number. For a square
$\square_{i_1 \ldots i_r}$, let $E\#\square_{i_1 \ldots i_r}$ denote the expected number of
sensors within $\square_{i_1 \ldots i_r}$. Then, while
$E\#\square_{i_1 \ldots i_r} > (\log n)^8$, the square $\square_{i_1 \ldots i_r}$ is
partitioned into $n_{r+1}$ subsquares $\square_{i_1 \ldots i_{r+1}}$, where $n_{r+1}$ is the
nearest integer to $\sqrt{E\#\square_{i_1 \ldots i_r}}$ that is the square of an even number.
Let

$$\ell := 1 + \sup_{\square_{i_1 \ldots i_r}} r,$$

i.e. the number of levels in this recursion. Given a square $\square_{i_1 \ldots i_r}$, let
$s(\square_{i_1 \ldots i_r})$ denote the sensor nearest to its center. By our construction,
these centers are well separated, and any sensor has this property with respect to at most
one square w.h.p. We shall denote this square by $\square(s)$. We assign a Level to each node
by the following rule: if $s = s(\square_{i_1 \ldots i_r})$, $s$ has Level $\ell - r$. These
nodes have Levels $1, \ldots, \ell$. There is a single root node at Level $\ell$, namely
$s(\square)$. The nodes at Level 0 are the nodes not of the form
$s(\square_{i_1 \ldots i_r})$. In the informal discussion earlier, we did not concern
ourselves with the error in the averaging carried out on subsquares $\square_i$. However,
these errors propagate up the hierarchy rapidly, and hence it is necessary to obtain results
with greater accuracy in smaller squares. Thus we define the desired accuracy recursively.
Let $\epsilon_r$ be the accuracy for the averaging process in a square
$\square_{i_1 \ldots i_r}$. Lemma 2 tells us that it is sufficient to take $\epsilon_r$ to be
$\frac{\epsilon_{r-1}}{p(n)}$ for a polynomial $p$ of sufficiently large degree.

Let e = e, 5o = 5. We recursively define e r+ ± := — ^7— and <5 r+ i = 4fer-

We define time(n, £ — 1, e r , S r ) to be ^(log ^-) \og(5 e } 1 )J . Thereafter, we define time(n, r —
1, ep_i, 5 r _i) := time(n, r, e r , 5 r )n a (log(^) log(5 r X )J .
Let s e

4.2 The Protocol

Every node $s$ has two states, a local.state and a global.state, both of which are initially
off, but can also take the value on. Each node $s$ possesses a private counter counter(s).
During initialization, the global.state of $s(\square)$ is set to on but every other
global.state is off. The local.state of all nodes is set to off at this juncture.

Let us suppose that the clock of $s$ ticks. We describe the protocol followed by it below. We
consider two cases. If $s$ is at Level 0, it obeys the following protocol:

{

1. If local.state(s) = on
Near(s);

}

Near(s){

1. $s$ picks an adjacent node $v$ contained in the same lowest-level square as $s$ uniformly
at random.

2. $s$ sets $x_s(t+1) = \frac{x_s(t) + x_v(t)}{2}$;
$v$ sets $x_v(t+1) = \frac{x_s(t) + x_v(t)}{2}$;

}

We next describe the protocol if $s$ is at a Level greater than 0. The subroutine Near is the
same as above. Let $\square(s) =: \square_{i_1 \ldots i_r}$.

{

1. If global.state(s) = on

(a) If counter(s) = 0, Activate.square(s);

(b) With probability $n^{-\alpha}\, time(n, r, \epsilon_r, \delta_r)^{-1}$

• Far(s);

• counter(s) ← 0;

2. If local.state(s) = on
Near(s);

3. If counter(s) > $time(n, r, \epsilon_r, \delta_r)$, Deactivate.square(s);
Else counter(s) ← counter(s) + 1;

}

Far(s){

1. $s$ picks a square $\square_{i_1 \ldots i_{r-1} j} \not\ni s$ uniformly at random. Let
$s' := s(\square_{i_1 \ldots i_{r-1} j})$. Node $s$ routes its value to $s'$ geographically.

2. Node $s'$ computes $x_{s'}(t+1) = x_{s'}(t) + \frac{2}{5}\left(E\#\square_{i_1 \ldots i_r}\, x_s(t) - E\#\square_{i_1 \ldots i_r}\, x_{s'}(t)\right)$.

3. $s'$ sends back a packet with its value $x_{s'}(t)$ to $s$ by greedy geographic routing.

4. Node $s$ computes $x_s(t+1) = x_s(t) + \frac{2}{5}\left(E\#\square_{i_1 \ldots i_r}\, x_{s'}(t) - E\#\square_{i_1 \ldots i_r}\, x_s(t)\right)$.

5. counter(s') ← 0.

}



Activate.square(s){

1. If $s \in$ Level 1, send packets to each node $s'$ in $\square(s)$ setting
local.state(s') ← on by flooding.

2. If $s \in$ Level $i > 1$, send packets to each Level $i - 1$ node $s'$ in $\square(s)$ by
greedy geographic routing, setting global.state(s') ← on.

}

Deactivate.square(s){

1. If $s \in$ Level 1, send packets to each node $s'$ in $\square(s)$ setting
local.state(s') ← off by flooding.

2. If $s \in$ Level $i > 1$, send packets to each Level $i - 1$ node $s'$ in $\square(s)$ by
greedy geographic routing, setting global.state(s') ← off.

}

5 Analyzing the number of Transmissions

Let $H(n, r, \epsilon_r, \delta_r)$ denote the number of transmissions used in our protocol
in one round of $\square_{i_1 \ldots i_r}$, in order to diminish the variance (of the values
carried by sensors in $\square_{i_1 \ldots i_r}$) by a factor $\epsilon_r$, with probability
$1 - \delta_r$.

Observation 1 In one round, i.e. the duration between $s$ activating
$\square(s) := \square_{i_1 \ldots i_r}$ and deactivating $\square(s)$, the number of
long-range packet exchanges between sensors of the kind




n = 



E#Pn...v] 



JE#Pn...i r ir 




S 



Each of these long-range packet exchanges is followed by a period of averaging within the
involved subsquares, and this takes $H(n, r+1, \epsilon_{r+1}, \delta_{r+1}) = O(n)$
transmissions. Thus we have the recurrence




As mentioned in subsection 4.1, we let $\epsilon_0 = \epsilon$, $\delta_0 = \delta$ and
recursively define $\epsilon_{r+1}$ and $\delta_{r+1}$. For these parameters,
$\delta_r = \Omega(\frac{1}{poly(n)})$, since $\delta = \Omega(\frac{1}{poly(n)})$ and the
$n_r$ telescope. $\epsilon_r = \epsilon_0 / n^{O(\log\log n)}$, since
$\ell \sim \log\log n$. Now, the smallest squares that we create have $O(polylog\, n)$
sensors each w.h.p. Since the ordinary averaging that we do there (described by the procedure
"Near(s)") has an averaging time that is quadratic [HE],
$H(n, \ell, \epsilon_\ell, \delta_\ell) = \Omega(polylog(\frac{n}{\epsilon}))$. And so using
the recurrence for $H$ and telescoping, we see that the total number of transmissions is

H(n,0,e ,6 ) = (H(n, £, e r+1> S r+1 )) J] { 1 log - j 

$$= n \left( \log\frac{n}{\epsilon} \right)^{O(\log\log n)}.$$
This is $n^{1+o(1)}$ if $\epsilon = \exp(-n^{o(1)/\log\log n})$ and $\delta = n^{-O(1)}$.



6 Notes on Correctness 



In the algorithm proposed in this paper, each square $\square(s)$ has a certain latency,
which is the averaging time restricted to that square. In order for our algorithm to be
correct, we require that $\square(s)$ be undisturbed by the long-range exchanges that $s$ is
involved in, during this period. This is not a condition that can be imposed without the
long-range exchanges of $s$ losing their i.i.d. property, which is crucial in our analysis of
convergence. In order to retain this, and have an algorithm that is successful w.h.p., we
have set the rates at which long-range exchanges of $s$ occur to be lower than the inverse of
the latency by a factor $n^{\alpha}$. As a consequence, w.h.p., in the course of the entire
algorithm, there are no long-range transmissions made by any node $s$ while $\square(s)$ is
active. The only issue that we have not dealt with in detail is that of showing that our
choice of errors $\epsilon_r$ achieves the desired end. This follows from Lemma 2 interpreted
as follows: the nodes $j$ represent subsquares $\square_{i_1 \ldots i_r j}$ of
$\square_{i_1 \ldots i_r}$, and the $y_j(t)$ for different $j$ represent the sum of the
values held by the nodes in a subsquare $\square_{i_1 \ldots i_r j}$ after $t$ long distance
transmissions between subsquares since the activation of $\square_{i_1 \ldots i_r}$. We set
$\epsilon := \epsilon_{r+1}\|x(0)\|$. The perturbations $n(t)$ represent the errors generated
from imperfect averaging within these subsquares.
7 Concluding Remarks



We introduced non-convex affine combinations in our averaging protocol in order to
accelerate Geographic Gossip in Geometric Random Graphs. The number of transmissions used
in the course of our protocol is $n^{1+o(1)}$. This exponent is asymptotically optimal. Our
algorithm, unlike the previous one in [5], is not completely decentralized. However, as far
as we can see, this is not a necessary feature associated with the use of affine combinations.

8 Future Directions 

It would be interesting to study whether affine combinations can be used to develop a 
completely decentralized algorithm for Geographic Gossip that is also energy efficient. 
A Appendix

Let $K_n$ be the complete graph on $n$ vertices $\{1, \ldots, n\}$. $\forall i$, let
$\alpha_i \in (\frac{1}{3}, \frac{1}{2})$. At time $t \ge 0$, for $i = 1, \ldots, n$, let
node $i$ hold the value $x_i(t)$. Consider the following update rule. If the $t$-th clock
tick belongs to node $i$, then $i$ chooses a node $j$ uniformly at random, and the following
update occurs:

• $x_i(t) = (1 - \alpha_i) x_i(t-1) + \alpha_j x_j(t-1)$.

• $x_j(t) = (1 - \alpha_j) x_j(t-1) + \alpha_i x_i(t-1)$.

Lemma 1 $E[x(t)^T x(t)] \le (1 - \frac{1}{2n})^t\, x(0)^T x(0)$.

Proof: Let the update rule for $x(t)$ be given by $A(t-1)$, i.e. $x(t) = A(t-1) x(t-1)$.
Note that $A(t-1) = I - (e_i - e_j)(\alpha_i e_i - \alpha_j e_j)^T$, if the $i$-th vector of
the standard basis is denoted by $e_i$.



$$E[x(t)^T x(t) \mid x(t-1)] = E[x(t-1)^T A(t-1)^T A(t-1) x(t-1) \mid x(t-1)]$$

$$= x(t-1)^T E[A(t-1)^T A(t-1)]\, x(t-1).$$

Let $\alpha_i e_i - \alpha_j e_j =: \alpha_{ij}$ and $e_i - e_j =: e_{ij}$. Then,
$E[A(t-1)^T A(t-1)] = E[(I - e_{ij}\alpha_{ij}^T)^T (I - e_{ij}\alpha_{ij}^T)]$.



Let $E_{ij}$ denote the $n \times n$ matrix whose $ij$-th entry is 1 and every other entry is 0.



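Then, by expanding, one finds that

$$E[A(t-1)^T A(t-1)] = I + \sum_i \frac{(1 - 2\alpha_i)^2 - 1}{n} E_{ii} + \sum_{i \ne j} \frac{1 - (1 - 2\alpha_i)(1 - 2\alpha_j)}{n(n-1)} E_{ij}$$

$$= \left(1 - \frac{1}{n-1}\right) I + \frac{\mathbf{1}\mathbf{1}^T}{n(n-1)} - \frac{(\mathbf{1} - 2\alpha)(\mathbf{1} - 2\alpha)^T}{n(n-1)} + \sum_i \frac{(1 - 2\alpha_i)^2}{n-1} E_{ii},$$

where $\alpha$ denotes the vector $(\alpha_1, \ldots, \alpha_n)^T$ and $\mathbf{1}$ the
all-ones vector.

An application of the formula for $E[x(t)^T x(t) \mid x(t-1)]$ now gives us the following:

$$E[x(t)^T x(t) \mid x(t-1)] = E[x(t-1)^T A(t-1)^T A(t-1) x(t-1) \mid x(t-1)] \qquad (1)$$

$$= x(t-1)^T E[A(t-1)^T A(t-1)]\, x(t-1) \qquad (2)$$

We know that $\forall i$, $1 - 2\alpha_i \in (0, \frac{1}{3})$.

Let us upper bound $x(t-1)^T E[A(t-1)^T A(t-1)]\, x(t-1)$ using the expression for
$E[A(t-1)^T A(t-1)]$ derived earlier. Since the update rule preserves the sum of the entries
of $x$, which is zero,

$$x(t-1)^T I \left(1 - \frac{1}{n-1}\right) x(t-1) = \left(1 - \frac{1}{n-1}\right) \|x(t-1)\|^2,$$

$$\frac{x(t-1)^T \mathbf{1}\mathbf{1}^T x(t-1)}{n(n-1)} = 0,$$

$$-\frac{x(t-1)^T (\mathbf{1} - 2\alpha)(\mathbf{1}^T - 2\alpha^T)\, x(t-1)}{n(n-1)} \le 0,$$

and,

$$x(t-1)^T \left( \sum_i \frac{(1 - 2\alpha_i)^2}{n-1} E_{ii} \right) x(t-1) \le \frac{\|x(t-1)\|^2}{9(n-1)}.$$

Adding up the above inequalities,

$$E[x(t)^T x(t) \mid x(t-1)] \le \left(1 - \frac{8}{9(n-1)}\right) x(t-1)^T x(t-1).$$

As a consequence,

$$E[\|x(t)\|^2 \mid x(t-1)] \le \left(1 - \frac{1}{2n}\right) \|x(t-1)\|^2.$$

Successively conditioning on $x(t-2), \ldots, x(0)$, we see that

$$E[\|x(t)\|^2] \le \left(1 - \frac{1}{2n}\right)^t \|x(0)\|^2.$$

This proves the lemma. □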
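
The one-step contraction in Lemma 1 can be checked numerically on small instances. The sketch below computes the exact conditional expectation by enumerating all ordered pairs $(i, j)$. It assumes, for illustration, weights $\alpha_i$ in $(\frac{1}{3}, \frac{1}{2})$ and a vector whose entries sum to zero; the function name is hypothetical.

```python
def expected_sq_norm_after_one_step(x, alpha):
    """Exact E[||x(t)||^2 | x(t-1) = x] for the affine update on the complete graph.

    The ordered pair (i, j), i != j, is uniform over all n(n-1) choices.
    """
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            y = list(x)
            # Both endpoints update from the old values x[i], x[j].
            y[i] = (1 - alpha[i]) * x[i] + alpha[j] * x[j]
            y[j] = (1 - alpha[j]) * x[j] + alpha[i] * x[i]
            total += sum(v * v for v in y)
    return total / (n * (n - 1))
```

For example, with `x = [3.0, -1.0, -2.0, 0.5, -0.5]` and `alpha = [0.34, 0.40, 0.45, 0.49, 0.36]`, the returned value can be compared against $(1 - \frac{1}{2n})\|x\|^2$ with $n = 5$.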
An application of Markov's inequality gives us the following corollary. 

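Corollary 1

$$P(\|x(t)\| > \epsilon \|x(0)\|) \le \epsilon^{-2} \left(1 - \frac{1}{2n}\right)^t.$$

Proof:

$$P(\|x(t)\| > \epsilon \|x(0)\|) = P\left( \frac{\|x(t)\|^2}{\|x(0)\|^2} > \epsilon^2 \right) \le \epsilon^{-2}\, E\left[ \frac{\|x(t)\|^2}{\|x(0)\|^2} \right] \quad \text{(Markov's inequality)}$$

$$\le \epsilon^{-2} \left(1 - \frac{1}{2n}\right)^t.$$

□

Corollary 2

$$P(\|x(t)\| > \epsilon \|x(0)\|) \le \epsilon^{-2} \left(1 - \frac{1}{2n}\right)^t.$$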



We now consider a modified update rule, and prove a lemma similar to Lemma 1.

Let $K_n$ be the complete graph on $n$ vertices $\{1, \ldots, n\}$. $\forall i$, let
$\alpha_i \in (\frac{1}{3}, \frac{1}{2})$. At time $t \ge 0$, for $i = 1, \ldots, n$, let
node $i$ hold the value $y_i(t)$. Let $n(0), n(1), \ldots$ be a sequence of real numbers.
Consider the following update rule. If the $t$-th clock tick belongs to node $i$, then $i$
chooses a node $j$ uniformly at random, and the following update occurs:

• $y_i(t) = (1 - \alpha_i) y_i(t-1) + \alpha_j y_j(t-1) + n(t-1)$.

• $y_j(t) = (1 - \alpha_j) y_j(t-1) + \alpha_i y_i(t-1) - n(t-1)$.



Lemma 2 Suppose that for each $t$, $|n(t)| < \epsilon$, and that $\alpha > 0$. Then,
P 



|y(f)||>n* ((l-^HyWH + Sv^n 3 ^ 
< — .



Proof: $y(t) = A(t-1) y(t-1) + \mathbf{n}(t-1)$, where
$A(t-1) = I - (e_i - e_j)(\alpha_i e_i - \alpha_j e_j)^T$, and
$\mathbf{n}(t-1) = n(t-1)(e_i - e_j)$. Let $x(0) = y(0)$, and let the $x(t)$ satisfy
$x(t+1) = A(t) x(t)$ as in Lemma 1. We observe that

$$y(1) = x(1) + \mathbf{n}(0)$$

and more generally,

$$y(t+1) = x(t+1) + \mathbf{n}(t) + \sum_{i=0}^{t-1} A(t) A(t-1) \cdots A(i+1)\, \mathbf{n}(i).$$
An application of the triangle inequality now gives us

$$\|y(t+1)\| \le \|x(t+1)\| + \|\mathbf{n}(t)\| + \sum_{i=0}^{t-1} \|A(t) A(t-1) \cdots A(i+1)\, \mathbf{n}(i)\|.$$

Our approach to proving this Lemma is to upper bound each term in the right hand side.



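Observation 2

$$P\left[ \|x(t)\| > \left(1 - \frac{1}{2n}\right)^{t/2} n^{\alpha/2} \|x(0)\| \right] \le \left( \left(1 - \frac{1}{2n}\right)^{t/2} n^{\alpha/2} \right)^{-2} \frac{E[\|x(t)\|^2]}{\|x(0)\|^2}$$

$$\le \left( \left(1 - \frac{1}{2n}\right)^{t/2} n^{\alpha/2} \right)^{-2} \left(1 - \frac{1}{2n}\right)^t = \frac{1}{n^{\alpha}}.$$

The above inequalities follow from Lemma 1 and Corollary 2. We shall now upper bound
the other terms as well with high probability. Using Corollary 2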
We next observe that



As a consequence we have

Observation 3

Once we put the above two observations together and note that
$(\forall i)\ \sqrt{2}\epsilon > \|\mathbf{n}(i)\|$, an application of the union bound gives